AITopics | inductive learning

Collaborating Authors

inductive learning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Prototype Language Models

Ley, Dan, Nguyen, Giang, Lakkaraju, Himabindu, Adebayo, Julius

arXiv.org Machine LearningJul-2-2026

Knowing which training examples drive outputs is fundamental to auditing, correcting, and understanding language models, yet for modern LLMs this remains expensive, approximate, and largely post-hoc. Standard language models generate tokens through a dense network pathway, causing training data's influence to be distributed across parameters rather than organized along explicit, traceable components. We introduce a prototype language model architecture, Prototypes for Interpretable Sequence Modeling (PRISM), that forms each prediction via a sparse, non-negative mixture of learned prototypes, trained with clustering objectives that anchor each prototype to coherent neighborhoods of training examples. Across architectures from 130M to 1.6B parameters trained on up to 50B tokens, prototype language models either surpass or remain within 2.5 percentage points on average downstream accuracy of matched dense baselines. We show that sparse prototype structure localizes curvature in the loss landscape, yielding a more tractable Hessian and enabling training data attribution that is ~500x faster than post hoc baselines when consuming equivalent memory. Calibrating linear prototype controllers can improve downstream accuracy by roughly 3 points while tracing those corrections back to training neighborhoods, and targeted prototype suppression can remove model behaviors without finetuning or measurable loss in generation quality.

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2607.0051

Country:

Europe (1.00)
North America > United States (0.67)
Asia (0.67)

Genre: Research Report > New Finding (0.45)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Keep It on a Leash: Controllable Pseudo-label Generation Towards Realistic Long-Tailed Semi-Supervised Learning

Neural Information Processing SystemsJun-23-2026, 11:27:35 GMT

Current long-tailed semi-supervised learning methods assume that labeled data exhibit a long-tailed distribution, and unlabeled data adhere to a typical predefined distribution (i.e., long-tailed, uniform, or inverse long-tailed). However, the distribution of the unlabeled data is generally unknown and may follow an arbitrary distribution. To tackle this challenge, we propose a Controllable Pseudo-label Generation (CPG) framework, expanding the labeled dataset with the progressively identified reliable pseudo-labels from the unlabeled dataset and training the model on the updated labeled dataset with a known distribution, making it unaffected by the unlabeled data distribution. Specifically, CPG operates through a controllable self-reinforcing optimization cycle: (i) at each training step, our dynamic controllable filtering mechanism selectively incorporates reliable pseudo-labels from the unlabeled dataset into the labeled dataset, ensuring that the updated labeled dataset follows a known distribution; (ii) we then construct a Bayes-optimal classifier using logit adjustment based on the updated labeled data distribution; (iii) this improved classifier subsequently helps identify more reliable pseudo-labels in the next training step. We further theoretically prove that this optimization cycle can significantly reduce the generalization error under some conditions. Additionally, we propose a class-aware adaptive augmentation module to further improve the representation of minority classes, and an auxiliary branch to maximize data utilization by leveraging all labeled and unlabeled samples. Comprehensive evaluations on various commonly used benchmark datasets show that CPG achieves consistent improvements, surpassing state-of-the-art methods by up to 15.97% in accuracy. The code is available at https://github.com/yaxinhou/CPG.

artificial intelligence, inductive learning, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Accelerating data-driven algorithm selection for combinatorial partitioning problems

Neural Information Processing SystemsJun-23-2026, 08:13:11 GMT

Data-driven algorithm selection is a powerful approach for choosing effective heuristics for computational problems. It operates by evaluating a set of candidate algorithms on a collection of representative training instances and selecting the one with the best empirical performance. However, running each algorithm on every training instance is computationally expensive, making scalability a central challenge. In practice, a common workaround is to evaluate algorithms on smaller proxy instances derived from the original inputs. However, this practice has remained largely ad hoc and lacked theoretical grounding. We provide the first theoretical foundations for this practice by formalizing the notion of size generalization: predicting an algorithm's performance on a large instance by evaluating it on a smaller, representative instance, subsampled from the original instance. We provide size generalization guarantees for three widely used clustering algorithms (single-linkage, k-means++, and Gonzalez's k-centers heuristic) and two canonical max-cut algorithms (Goemans-Williamson and Greedy). We characterize the subsample size sufficient to ensure that performance on the subsample reflects performance on the full instance, and our experiments support these findings.

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.27)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

An Effective Levelling Paradigm for Unlabeled Scenarios

Neural Information Processing SystemsJun-23-2026, 07:15:34 GMT

Advancements in direct-integration fine-tuning frameworks have underscored their potential to enhance the performance of labeled scenarios and tasks. To enhance the generalization of different categories in the same dataset, some methods have added visual loss to these frameworks for unlabeled scenarios. However, the performance of these methods through visual loss does not improve significantly in domain generalization and cross-dataset generalization tasks. This may be attributed to the uncoordinated learning of the two-modalities alignment and visual loss. To mitigate this issue of uncoordinated learning, we propose a novel method called Levelling Paradigm (LePa) to improve performance for unlabeled tasks or scenarios. The proposed LePa, designed as a plug-in module, dynamically constrains and coordinates multiple objective functions, thereby improving the generalization of these baseline methods. Comprehensive experiments have shown that our design can effectively address generalized scenarios and tasks.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.76)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

ACloser Look to Positive-Unlabeled Learning from Fine-grained Perspectives: An Empirical Study

Neural Information Processing SystemsJun-23-2026, 03:37:46 GMT

Positive-Unlabeled (PU) learning refers to a specific weakly-supervised learning paradigm that induces a binary classifier with a few positive labeled instances and massive unlabeled instances. To handle this task, the community has proposed dozens of PU learning methods with various techniques, demonstrating strong potential. In this paper, we conduct a comprehensive study to investigate the basic characteristics of current PU learning methods. We organize them into two fundamental families of PU learning, including disambiguation-free empirical risks, which approximate the expected risk of supervised learning, and pseudo-labeling methods, which estimate pseudo-labels for unlabeled instances. First, we make an empirical analysis on disambiguation-free empirical risks such as uPU, nnPU, and DistPU, and suggest a novel risk-consistent set-aware empirical risk from the perspective of aggregate supervision. Second, we make an empirical analysis of pseudo-labeling methods to evaluate the potential of pseudo-label estimation techniques and widely applied generic tricks in PU learning. Finally, based on those empirical findings, we propose a general framework of PU learning by integrating the set-aware empirical risk with pseudo-labeling. Compared with existing PU learning methods, the proposed framework can be a practical benchmark in PU learning.

artificial intelligence, inductive learning, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)

Add feedback

Imbalances in Neurosymbolic Learning: Characterization and Mitigating Strategies

Neural Information Processing SystemsJun-22-2026, 23:24:21 GMT

We study one of the most popular problems in neurosymbolic learning (NSL), that of learning neural classifiers given only the result of applying a symbolic component σ to the gold labels of the elements of a vector x. The gold labels of the elements in x are unknown to the learner. We make multiple contributions, theoretical and practical, to address a problem that has not been studied so far in this context, that of characterizing and mitigating learning imbalances, i.e., major differences in the errors that occur when classifying instances of different classes (aka class-specific risks). Our theoretical reveals a unique phenomenon: that σ can greatly impact learning imbalances. This result sharply contrasts with previous research on supervised and weakly supervised learning, which only studies learning imbalances under data imbalances. On the practical side, we introduce a technique for estimating the marginal of the hidden gold labels using weakly supervised data. Then, we introduce algorithms that mitigate imbalances at training and testing time by treating the marginal of the hidden labels as a constraint. We demonstrate the effectiveness of our techniques using strong baselines from NSL and long-tailed learning, suggesting performance improvements of up to 14%.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Media (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

The Power of Iterative Filtering for Supervised Learning with (Heavy) Contamination

Neural Information Processing SystemsJun-22-2026, 23:18:23 GMT

Inspired by recent work on learning with distribution shift, we give a general outlier removal algorithm called iterative polynomial filtering and show a number of striking applications for supervised learning with contamination: (1) We show that any function class that can be approximated by low-degree polynomials with respect to a hypercontractive distribution can be efficiently learned under bounded contamination (also known as nasty noise). This is a surprising resolution to a longstanding gap between the complexity of agnostic learning and learning with contamination, as it was widely believed that low-degree approximators only implied tolerance to label noise.

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
(2 more...)

Add feedback

Neural Rule Lists: Learning Discretizations, Rules, and Order in One Go

Neural Information Processing SystemsJun-22-2026, 22:34:22 GMT

Interpretable machine learning is essential in high-stakes domains like healthcare. Rule lists are a popular choice due to their transparency and accuracy, but learning them effectively remains a challenge. Existing methods require feature pre-discretization, constrain rule complexity or ordering, or struggle to scale. We present NEURULES, a novel end-to-end framework that overcomes these limitations. At its core, NEURULES transforms the inherently combinatorial task of rule list learning into a differentiable optimization problem, enabling gradient-based learning. It simultaneously discovers feature conditions, assembles them into conjunctive rules, and determines their order--without pre-processing or manual constraints. A key contribution here is a gradient shaping technique that steers learning toward sparse rules with strong predictive performance. To produce ordered lists, we introduce a differentiable relaxation that, through simulated annealing, converges to a strict rule list. Extensive experiments show that NEURULES consistently outperforms combinatorial and neural baselines on binary as well as multi-class classification tasks across a wide range of datasets.

artificial intelligence, machine learning, predicate, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area (0.70)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)

Add feedback

CG-SSL: Concept-Guided Self-Supervised Learning

Neural Information Processing SystemsJun-22-2026, 21:26:11 GMT

Humans understand visual scenes by first capturing a global impression and then refining this understanding into distinct, object-like components. Inspired by this process, we introduce Concept-Guided Self-Supervised Learning (CG-SSL), a novel framework that brings structure and interpretability to representation learning through a curriculum of three training phases: (1) global scene encoding, (2) discovery of visual concepts via tokenised cross-attention, and (3) alignment of these concepts across views. Unlike traditional SSL methods, which simply enforce similarity between multiple augmented views of the same image, CG-SSL accounts for the fact that these views may highlight different parts of an object or scene. To address this, our method establishes explicit correspondences between views and aligns the representations of meaningful image regions. At its core, CG-SSL augments standard SSL with a lightweight decoder that learns and refines concept tokens via cross-attention with patch features. The concept tokens are trained using masked concept distillation and a feature-space reconstruction objective. A final alignment stage enforces view consistency by geometrically matching concept regions under heavy augmentation, enabling more compact, robust, and disentangled representations of scene regions. Across multiple backbone sizes, CGSSL achieves state-of-the-art results on image segmentation benchmarks using kNN and linear probes, substantially outperforming prior methods and approaching, or even surpassing, the performance of leading SSL models trained on over 100 more data. Code and pretrained models will be released.

artificial intelligence, machine learning, representation, (16 more...)

Neural Information Processing Systems

Country: Asia > Middle East > UAE (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
(2 more...)

Add feedback

QiMeng-MuPa: Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

Neural Information Processing SystemsJun-22-2026, 21:13:03 GMT

The rise of GPU-based high-performance computing (HPC) has driven the widespread adoption of parallel programming models such as CUDA. Yet, the inherent complexity of parallel programming creates a demand for the automated sequential-to-parallel approaches. However, data scarcity poses a significant challenge for machine learning-based sequential-to-parallel code translation. Although recent back-translation methods show promise, they still fail to ensure functional equivalence in the translated code. In this paper, we propose QiMeng-MuPa, a novel Mutual-Supervised Learning framework for Sequential-to-Parallel code translation, to address the functional equivalence issue.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country: